64 research outputs found

Using product indicators in restricted factor analysis models to detect nonuniform measurement bias

Author: A Satorra
CM Woods
DA Kenny
FJ Oort
G Schwartz
G-C Lin
GJ Mellenbergh
HW Marsh
J Henseler
LK Muthén
MT Barendse
N Umbach
RJ Vandenberg
TD Little
Y Rosseel
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2018
Field of study

Crossref

International Migration, Integration and Social Cohesion online publications

UvA-DARE

Phenotypic Complexity, Measurement Bias, and Poor Phenotypic Resolution Contribute to the Missing Heritability Problem in Genetic Association Studies

Background: The variance explained by genetic variants as identified in (genome-wide) genetic association studies is typically small compared to family-based heritability estimates. Explanations of this ‘missing heritability’ have been mainly genetic, such as genetic heterogeneity and complex (epi-)genetic mechanisms. Methodology: We used comprehensive simulation studies to show that three phenotypic measurement issues also provide viable explanations of the missing heritability: phenotypic complexity, measurement bias, and phenotypic resolution. We identify the circumstances in which the use of phenotypic sum-scores and the presence of measurement bias lower the power to detect genetic variants. In addition, we show how the differential resolution of psychometric instruments (i.e., whether the instrument includes items that resolve individual differences in the normal range or in the clinical range of a phenotype) affects the power to detect genetic variants. Conclusion: We conclude that careful phenotypic data modelling can improve the genetic signal, and thus the statistical power to identify genetic variants by 20-99

Public Library of Science (PLOS)

CiteSeerX

Crossref

VU Research Portal

Directory of Open Access Journals

PubMed Central

UvA-DARE

International Migration, Integration and Social Cohesion online publications

On environment difficulty and discriminating power

Author: AE Elo
B Hibbard
CK Low
DK Hardman
DL Dowe
DM Barch
DZ Du
GA Miller
GJ Chaitin
GJ Mellenbergh
H Zenil
HA Simon
IP Gent
J Anderson
J He
J Hernández-Orallo
J Insa-Cabrera
José Hernández-Orallo
K Arai
K Kotovsky
L Antunes
L Busoniu
LA Levin
M Dastani
M Li
MG Madden
P Orponen
PJ Ferrando
R Team
RH Bordini
S Gruner
S Legg
S Whiteson
S Wolfram
SE Embretson
Z Zatuchna
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/05/2015
Field of study

The final publication is available at Springer via http://dx.doi.org/10.1007/s10458-014-9257-1

This paper presents a way to estimate the difficulty and discriminating power of any task instance. We focus on a very general setting for tasks: interactive (possibly multiagent) environments where an agent acts upon observations and rewards. Instead of analysing the complexity of the environment, the state space or the actions that are performed by the agent, we analyse the performance of a population of agent policies against the task, leading to a distribution that is examined in terms of policy complexity. This distribution is then sliced by the algorithmic complexity of the policy and analysed through several diagrams and indicators. The notion of environment response curve is also introduced, by inverting the performance results into an ability scale. We apply all these concepts, diagrams and indicators to two illustrative problems: a class of agent-populated elementary cellular automata, showing how the difficulty and discriminating power may vary for several environments, and a multiagent system, where agents can become predators or preys, and may need to coordinate. Finally, we discuss how these tools can be applied to characterise (interactive) tasks and (multi-agent) environments. These characterisations can then be used to get more insight about agent performance and to facilitate the development of adaptive tests for the evaluation of agent abilities.

I thank the reviewers for their comments, especially those aiming at a clearer connection with the field of multi-agent systems and the suggestion of better approximations for the calculation of the response curves. The implementation of the elementary cellular automata used in the environments is based on the library 'CellularAutomaton' by John Hughes for R [58]. I am grateful to Fernando Soler-Toscano for letting me know about their work [65] on the complexity of 2D objects generated by elementary cellular automata. I would also like to thank David L. Dowe for his comments on a previous version of this paper. This work was supported by the MEC/MINECO projects CONSOLIDER-INGENIO CSD2007-00022 and TIN 2010-21062-C02-02, GVA project PROMETEO/2008/051, the COST - European Cooperation in the field of Scientific and Technical Research IC0801 AT, and the REFRAME project, granted by the European Coordinated Research on Long-term Challenges in Information and Communication Sciences & Technologies ERA-Net (CHIST-ERA), and funded by the Ministerio de Economia y Competitividad in Spain (PCIN-2013-037).

José Hernández-Orallo (2015). On environment difficulty and discriminating power. Autonomous Agents and Multi-Agent Systems. 29(3):402-454. https://doi.org/10.1007/s10458-014-9257-1

[1] Anderson, J., Baltes, J., & Cheng, C. T. (2011). Robotics competitions as benchmarks for AI research. The Knowledge Engineering Review, 26(01), 11–17.
[2] Andre, D., & Russell, S. J. (2002). State abstraction for programmable reinforcement learning agents. In Proceedings of the National Conference on Artificial Intelligence (pp. 119–125). Menlo Park, CA; Cambridge, MA; London; AAAI Press; MIT Press; 1999.
[3] Antunes, L., Fortnow, L., van Melkebeek, D., & Vinodchandran, N. V. (2006). Computational depth: Concept and applications. Theoretical Computer Science, 354(3), 391–404. Foundations of Computation Theory (FCT 2003), 14th Symposium on Fundamentals of Computation Theory 2003.
[4] Arai, K., Kaminka, G. A., Frank, I., & Tanaka-Ishii, K. (2003). Performance competitions as research infrastructure: Large scale comparative studies of multi-agent teams. Autonomous Agents and Multi-Agent Systems, 7(1–2), 121–144.
[5] Ashcraft, M. H., Donley, R. D., Halas, M. A., & Vakali, M. (1992). Chapter 8 working memory, automaticity, and problem difficulty. In Jamie I.D. Campbell (Ed.), The nature and origins of mathematical skills, volume 91 of advances in psychology (pp. 301–329). North-Holland.
[6] Ay, N., Müller, M., & Szkola, A. (2010). Effective complexity and its relation to logical depth. IEEE Transactions on Information Theory, 56(9), 4593–4607.
[7] Barch, D. M., Braver, T. S., Nystrom, L. E., Forman, S. D., Noll, D. C., & Cohen, J. D. (1997). Dissociating working memory from task difficulty in human prefrontal cortex. Neuropsychologia, 35(10), 1373–1380.
[8] Bordini, R. H., Hübner, J. F., & Wooldridge, M. (2007). Programming multi-agent systems in AgentSpeak using Jason. London: Wiley.
[9] Boutilier, C., Reiter, R., Soutchanski, M., Thrun, S. et al. (2000). Decision-theoretic, high-level agent programming in the situation calculus. In Proceedings of the National Conference on Artificial Intelligence (pp. 355–362). Menlo Park, CA; Cambridge, MA; London; AAAI Press; MIT Press; 1999.
[10] Busoniu, L., Babuska, R., & De Schutter, B. (2008). A comprehensive survey of multiagent reinforcement learning. IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, 38(2), 156–172.
[11] Chaitin, G. J. (1977). Algorithmic information theory. IBM Journal of Research and Development, 21, 350–359.
[12] Chedid, F. B. (2010). Sophistication and logical depth revisited. In 2010 IEEE/ACS International Conference on Computer Systems and Applications (AICCSA) (pp. 1–4). IEEE.
[13] Cheeseman, P., Kanefsky, B. & Taylor, W. M. (1991). Where the really hard problems are. In Proceedings of IJCAI-1991 (pp. 331–337).
[14] Dastani, M. (2008). 2APL: A practical agent programming language. Autonomous Agents and Multi-agent Systems, 16(3), 214–248.
[15] Delahaye, J. P. & Zenil, H. (2011). Numerical evaluation of algorithmic complexity for short strings: A glance into the innermost structure of randomness. Applied Mathematics and Computation, 219(1), 63–77.
[16] Dowe, D. L. (2008). Foreword re C. S. Wallace. Computer Journal, 51(5), 523–560. Christopher Stewart WALLACE (1933–2004) memorial special issue.
[17] Dowe, D. L., & Hernández-Orallo, J. (2012). IQ tests are not for machines, yet. Intelligence, 40(2), 77–81.
[18] Du, D. Z., & Ko, K. I. (2011). Theory of computational complexity (Vol. 58). London: Wiley-Interscience.
[19] Elo, A. E. (1978). The rating of chessplayers, past and present (Vol. 3). London: Batsford.
[20] Embretson, S. E., & Reise, S. P. (2000). Item response theory for psychologists. London: Lawrence Erlbaum.
[21] Fatès, N. & Chevrier, V. (2010). How important are updating schemes in multi-agent systems? An illustration on a multi-turmite model. In Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems: Volume 1 (pp. 533–540). International Foundation for Autonomous Agents and Multiagent Systems.
[22] Ferber, J. & Müller, J. P. (1996). Influences and reaction: A model of situated multiagent systems. In Proceedings of Second International Conference on Multi-Agent Systems (ICMAS-96) (pp. 72–79).
[23] Ferrando, P. J. (2009). Difficulty, discrimination, and information indices in the linear factor analysis model for continuous item responses. Applied Psychological Measurement, 33(1), 9–24.
[24] Ferrando, P. J. (2012). Assessing the discriminating power of item and test scores in the linear factor-analysis model. Psicológica, 33, 111–139.
[25] Gent, I. P., & Walsh, T. (1994). Easy problems are sometimes hard. Artificial Intelligence, 70(1), 335–345.
[26] Gershenson, C. & Fernandez, N. (2012). Complexity and information: Measuring emergence, self-organization, and homeostasis at multiple scales. Complexity, 18(2), 29–44.
[27] Gruner, S. (2010). Mobile agent systems and cellular automata. Autonomous Agents and Multi-agent Systems, 20(2), 198–233.
[28] Hardman, D. K., & Payne, S. J. (1995). Problem difficulty and response format in syllogistic reasoning. The Quarterly Journal of Experimental Psychology, 48(4), 945–975.
[29] He, J., Reeves, C., Witt, C., & Yao, X. (2007). A note on problem difficulty measures in black-box optimization: Classification, realizations and predictability. Evolutionary Computation, 15(4), 435–443.
[30] Hernández-Orallo, J. (2000). Beyond the Turing test. Journal of Logic Language & Information, 9(4), 447–466.
[31] Hernández-Orallo, J. (2000). On the computational measurement of intelligence factors. In A. Meystel (Ed.), Performance metrics for intelligent systems workshop (pp. 1–8). Gaithersburg, MD: National Institute of Standards and Technology.
[32] Hernández-Orallo, J. (2000). Thesis: Computational measures of information gain and reinforcement in inference processes. AI Communications, 13(1), 49–50.
[33] Hernández-Orallo, J. (2010). A (hopefully) non-biased universal environment class for measuring intelligence of biological and artificial systems. In M. Hutter et al. (Ed.), 3rd International Conference on Artificial General Intelligence (pp. 182–183). Atlantis Press. Extended report at http://users.dsic.upv.es/proy/anynt/unbiased.pdf.
[34] Hernández-Orallo, J., & Dowe, D. L. (2010). Measuring universal intelligence: Towards an anytime intelligence test. Artificial Intelligence, 174(18), 1508–1539.
[35] Hernández-Orallo, J., Dowe, D. L., España-Cubillo, S., Hernández-Lloreda, M. V., & Insa-Cabrera, J. (2011). On more realistic environment distributions for defining, evaluating and developing intelligence. In J. Schmidhuber, K. R. Thórisson, & M. Looks (Eds.), LNAI series on artificial general intelligence 2011 (Vol. 6830, pp. 82–91). Berlin: Springer.
[36] Hernández-Orallo, J., Dowe, D. L., & Hernández-Lloreda, M. V. (2014). Universal psychometrics: Measuring cognitive abilities in the machine kingdom. Cognitive Systems Research, 27, 50–74.
[37] Hernández-Orallo, J., Insa, J., Dowe, D. L. & Hibbard, B. (2012). Turing tests with Turing machines. In A. Voronkov (Ed.), The Alan Turing Centenary Conference, Turing-100, Manchester, 2012, volume 10 of EPiC Series (pp. 140–156).
[38] Hernández-Orallo, J. & Minaya-Collado, N. (1998). A formal definition of intelligence based on an intensional variant of Kolmogorov complexity. In Proceedings of International Symposium of Engineering of Intelligent Systems (EIS’98) (pp. 146–163). ICSC Press.
[39] Hibbard, B. (2009). Bias and no free lunch in formal measures of intelligence. Journal of Artificial General Intelligence, 1(1), 54–61.
[40] Hoos, H. H. (1999). SAT-encodings, search space structure, and local search performance. In 1999 International Joint Conference on Artificial Intelligence (Vol. 16, pp. 296–303).
[41] Insa-Cabrera, J., Benacloch-Ayuso, J. L., & Hernández-Orallo, J. (2012). On measuring social intelligence: Experiments on competition and cooperation. In J. Bach, B. Goertzel, & M. Iklé (Eds.), AGI, volume 7716 of lecture notes in computer science (pp. 126–135). Berlin: Springer.
[42] Insa-Cabrera, J., Dowe, D. L., España-Cubillo, S., Hernández-Lloreda, M. V., & Hernández-Orallo, J. (2011). Comparing humans and AI agents. In J. Schmidhuber, K. R. Thórisson, & M. Looks (Eds.), LNAI series on artificial general intelligence 2011 (Vol. 6830, pp. 122–132). Berlin: Springer.
[43] Knuth, D. E. (1973). Sorting and searching, volume 3 of the art of computer programming. Reading, MA: Addison-Wesley.
[44] Kotovsky, K., & Simon, H. A. (1990). What makes some problems really hard: Explorations in the problem space of difficulty. Cognitive Psychology, 22(2), 143–183.
[45] Legg, S. (2008). Machine super intelligence. PhD thesis, Department of Informatics, University of Lugano, June 2008.
[46] Legg, S., & Hutter, M. (2007). Universal intelligence: A definition of machine intelligence. Minds and Machines, 17(4), 391–444.
[47] Leonetti, M. & Iocchi, L. (2010). Improving the performance of complex agent plans through reinforcement learning. In Proceedings of the 2010 International Conference on Autonomous Agents and Multiagent Systems (Vol. 1, pp. 723–730). International Foundation for Autonomous Agents and Multiagent Systems.
[48] Levin, L. A. (1973). Universal sequential search problems. Problems of Information Transmission, 9(3), 265–266.
[49] Levin, L. A. (1986). Average case complete problems. SIAM Journal on Computing, 15, 285.
[50] Li, M., & Vitányi, P. (2008). An introduction to Kolmogorov complexity and its applications (3rd ed.). Berlin: Springer.
[51] Low, C. K., Chen, T. Y., & Rónnquist, R. (1999). Automated test case generation for BDI agents. Autonomous Agents and Multi-agent Systems, 2(4), 311–332.
[52] Madden, M. G., & Howley, T. (2004). Transfer of experience between reinforcement learning environments with progressive difficulty. Artificial Intelligence Review, 21(3), 375–398.
[53] Mellenbergh, G. J. (1994). Generalized linear item response theory. Psychological Bulletin, 115(2), 300.
[54] Michel, F. (2004). Formalisme, outils et éléments méthodologiques pour la modélisation et la simulation multi-agents. PhD thesis, Université des sciences et techniques du Languedoc, Montpellier.
[55] Miller, G. A. (1956). The magical number seven, plus or minus two: Some limits on our capacity for processing information. Psychological Review, 63(2), 81.
[56] Orponen, P., Ko, K. I., Schöning, U., & Watanabe, O. (1994). Instance complexity. Journal of the ACM (JACM), 41(1), 96–121.
[57] Simon, H. A., & Kotovsky, K. (1963). Human acquisition of concepts for sequential patterns. Psychological Review, 70(6), 534.
[58] Team, R., et al. (2013). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing.
[59] Whiteson, S., Tanner, B., & White, A. (2010). The reinforcement learning competitions. The AI Magazine, 31(2), 81–94.
[60] Wiering, M., & van Otterlo, M. (Eds.). (2012). Reinforcement learning: State-of-the-art. Berlin: Springer.
[61] Wolfram, S. (2002). A new kind of science. Champaign, IL: Wolfram Media.
[62] Zatuchna, Z., & Bagnall, A. (2009). Learning mazes with aliasing states: An LCS algorithm with associative perception. Adaptive Behavior, 17(1), 28–57.
[63] Zenil, H. (2010). Compression-based investigation of the dynamical properties of cellular automata and other systems. Complex Systems, 19(1), 1–28.
[64] Zenil, H. (2011). Une approche expérimentale à la théorie algorithmique de la complexité. PhD thesis, Dissertation in fulfilment of the degree of Doctor in Computer Science, Université de Lille.
[65] Zenil, H., Soler-Toscano, F., Delahaye, J. P. & Gauvrit, N. (2012). Two-dimensional Kolmogorov complexity and validation of the coding theorem method by compressibility. arXiv, preprint arXiv:1212.6745

Crossref

RiuNet

Assessment and Etiology of Attention Deficit Hyperactivity Disorder and Oppositional Defiant Disorder in Boys and Girls

Author: American Psychiatric Association
B Maughan
CK Conners
Conor V. Dolan
CV Dolan
DI Boomsma
Dorret I. Boomsma
E Simonoff
E Vierikko
EJCG Oord van den
EM Derks
Eske M. Derks
F Levy
GH Lubke
GJ Mellenbergh
J Biederman
J Fantuzzo
Jim J. Hudziak
JJ Hudziak
KG Jöreskog
KJ Saudino
L Hu
M Gaub
M Lynch
MC Neale
Michael C. Neale
MJH Rietveld
MW Browne
N Martin
PM Bentler
R Loeber
RD Bock
W Meredith
Publication venue: Kluwer Academic Publishers-Plenum Publishers
Publication date: 01/01/2007
Field of study

Attention deficit hyperactivity disorder (ADHD) and oppositional defiant disorder (ODD) are more common in boys than girls. In this paper, we investigated whether the prevalence differences are attributable to measurement bias. In addition, we examined sex differences in the genetic and environmental influences on variation in these behaviors. Teachers completed the Conners Teacher Rating Scale-Revised: Short version (CTRS-R:S) in a sample of 800 male and 851 female 7-year-old Dutch twins. No sex differences in the factor structure of the CTRS-R:S were found, implying the absence of measurement bias. The heritabilities for both ADHD and ODD were high and were the same in boys and girls. However, partly different genes are expressed in boys and girls

Crossref

VU Research Portal

Springer - Publisher Connector

PubMed Central

International Migration, Integration and Social Cohesion online publications

Testing the Assumption of Measurement Invariance in the SAMHSA Mental Health and Alcohol Abuse Stigma Assessment in Older Adults

Author: AT Beck
B King-Kallimanis
Bellinda L. King-Kallimanis
BG Link
CF Reynolds
CV Dolan
D Mechanic
D Sheehan
DB Wagenaar
ESL Gomberg
FC Blow
FJ Oort
Frans J. Oort
GJ Mellenbergh
GS Howard
J Cohen
JA Sirey
JE Ware
KM Keyes
KZ Bambauer
Lawrence Schonfeld
LS Radloff
MAG Sprangers
MW Browne
N Graham
N Schmitt
Nancy Lynn
RJ Vandenberg
SE Levkoff
SM Smith
W Meredith
Publication venue: 'Springer Science and Business Media LLC'
Publication date
Field of study

Crossref

Response shift in patient-reported outcomes: definition, theory, and a revised model

Author: A Barrington
A Guilleux
A Tversky
A Vanier
AC Michalos
AL Stanton
AT Panter
B Colburn
BB Reeve
BD Rapkin
CE Schwartz
CL Park
CS Carver
E Barnes
E Diener
E Wetzel
F Lievens
FJ Oort
G Norman
GJ Mellenbergh
GS Howard
GW Donaldson
IB Wilson
J Brandtstädter
JA Finkelstein
JB Grace
JM Baker
JR Boehnke
L Boyer
L Festinger
L McClimans
LM Collins
M Andrykowski
M Blanchin
M Mishel
M Ross
MAG Sprangers
MGE Verdam
NE Mayo
P Brickman
PA Ubel
PM Fayers
R Lazarus
R Nerenz
R Sawatzky
R Tourangeau
RT Golembiewski
S Mukherjee
TT Sajobi
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 28/04/2021
Field of study

Purpose: The extant response shift definitions and theoretical response shift models, while helpful, also introduce predicaments, and theoretical debates continue. To address these predicaments and stimulate empirical research, we propose a more specific formal definition of response shift and a revised theoretical model. Methods: This work is an international collaborative effort and involved a critical assessment of the literature. Results: Three main predicaments were identified. First, the formal definitions of response shift need further specification and clarification. Second, previous models were focused on explaining change in the construct intended to be measured rather than explaining the construct at multiple time points and neglected the importance of using at least two time points to investigate response shift. Third, extant models do not explicitly distinguish the measure from the construct. Here we define response shift as an effect occurring whenever observed change (e.g., change in patient-reported outcome measures (PROM) scores) is not fully explained by target change (i.e., change in the construct intended to be measured). The revised model distinguishes the measure (e.g., PROM) from the underlying target construct (e.g., quality of life) at two time points. The major plausible paths are delineated, and the underlying assumptions of this model are explicated. Conclusion: It is our hope that this refined definition and model are useful in the further development of response shift theory. The model, with its explicit list of assumptions and hypothesized relationships, lends itself to critical, empirical examination. Future studies are needed to empirically test the assumptions and hypothesized relationships

Crossref

HAL Descartes

HAL Université de Tours

University of Dundee Online Publications

A proof of principle for using adaptive testing in routine Outcome Monitoring: the efficiency of the Mood and Anxiety Symptoms Questionnaire - Anhedonic Depression CAT

Background: In Routine Outcome Monitoring (ROM) there is a high demand for short assessments. Computerized Adaptive Testing (CAT) is a promising method for efficient assessment. In this article, the efficiency of a CAT version of the Mood and Anxiety Symptom Questionnaire - Anhedonic Depression scale (MASQ-AD) for use in ROM was scrutinized in a simulation study. Methods: The responses of a large sample of patients (N = 3,597) obtained through ROM were used. The psychometric evaluation showed that the items met the requirements for CAT. In the simulations, CATs with several measurement precision requirements were run on the item responses as if they had been collected adaptively. Results: CATs employing only a small number of items gave results which, both in terms of depression measurement and criterion validity, were only marginally different from the results of a full MASQ-AD assessment. Conclusions: It was concluded that CAT improved the efficiency of the MASQ-AD questionnaire very much. The strengths and limitations of the application of CAT in ROM are discussed.

Crossref

VU Research Portal

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Leiden University Scholary Publications

Some recommendations for developing multidimensional computerized adaptive tests for patient-reported outcomes

Author: AC Dueck
AM Boyd
BB Reeve
BF Green
C Wang
C-H Chang
CAW Glas
CG Forero
D Thissen
DG Seo
DJ Thissen
DJ Weiss
DO Segall
DSJ Costa
E Basch
Food and Drug Administration
G Flens
G Maruyama
GJ Mellenbergh
J Brazier
J Speight
JA Landsheer
Jan R. Böhnke
JC Nunnally
JR Edwards
JS Gorin
KA Bollen
KJ Yost
L Cai
L Yao
M Brod
M Doostfatemeh
M Heo
M Martin
MAG Sprangers
MC Edwards
MCS Paap
MD Reckase
Muirne C. S. Paap
MW Browne
N Deng
N Smits
Niels Smits
OS Chernyshenko
P Fayers
P Levy
P Michel
PM Fayers
R Holman
RC MacCallum
RJ Adams
RJ Swartz
RJD Ayala
RK Tsutakawa
RM Luecht
RP Chalmers
S Jiang
SE Embretson
SM Wu
SP Reise
SW Choi
T Hastie
V Sebille
W Bonifay
W-C Wang
WA Nicewander
WHM Emons
Y Zheng
YH Li
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/04/2018
Field of study

PURPOSE: Multidimensional item response theory and computerized adaptive testing (CAT) are increasingly used in mental health, quality of life (QoL), and patient-reported outcome measurement. Although multidimensional assessment techniques hold promises, they are more challenging in their application than unidimensional ones. The authors comment on minimal standards when developing multidimensional CATs. METHODS: Prompted by pioneering papers published in QLR, the authors reflect on existing guidance and discussions from different psychometric communities, including guidelines developed for unidimensional CATs in the PROMIS project. RESULTS: The commentary focuses on two key topics: (1) the design, evaluation, and calibration of multidimensional item banks and (2) how to study the efficiency and precision of a multidimensional item bank. The authors suggest that the development of a carefully designed and calibrated item bank encompasses a construction phase and a psychometric phase. With respect to efficiency and precision, item banks should be large enough to provide adequate precision over the full range of the latent constructs. Therefore CAT performance should be studied as a function of the latent constructs and with reference to relevant benchmarks. Solutions are also suggested for simulation studies using real data, which often result in too optimistic evaluations of an item bank's efficiency and precision. DISCUSSION: Multidimensional CAT applications are promising but complex statistical assessment tools which necessitate detailed theoretical frameworks and methodological scrutiny when testing their appropriateness for practical applications. The authors advise researchers to evaluate item banks with a broad set of methods, describe their choices in detail, and substantiate their approach for validation

Crossref

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

University of Dundee Online Publications

UvA-DARE

International Migration, Integration and Social Cohesion online publications

Dissertations of the University of Groningen

Beyond trial types

Author: C Bundesen
C Bundesen
C Bundesen
CE McCulloch
Claus Bundesen
CR Gillebert
G Hinton
GJ Mellenbergh
H Shibuya
IH Robertson
J Duncan
J Hung
JF Cavanagh
JP O’Doherty
K Knoblauch
M Dyrholm
Mads Dyrholm
NJ Rouder
RS Woodworth
S Kyllingsbæk
S Vangkilde
Signe Vangkilde
T Habekost
Publication venue: 'Springer Science and Business Media LLC'
Publication date
Field of study

Crossref

Assessing the adequacy of self-reported alcohol abuse measurement across time and ethnicity: cross-cultural equivalence across Hispanics and Caucasians in 1992, non-equivalence in 2001–2002

Author: AC Carle
AC Carle
AC Carle
AC Carle
AC Carle
Adam C Carle
AE Kazdin
AL Stewart
American Psychiatric Association
B Byrne
B Muthén
BD Smedley
BF Grant
BM Byrne
BO Muthén
CA McHorney
CD Huang
CH Hui
D Borsboom
D Hasin
D Thissen
DA Cole
DA Dawson
DS Hasin
GJ Mellenbergh
GN Wright
GP Knight
GW Cheung
H Harwood
H Prelow
Harford
HM Prelow
J Schafer
JA Teresi
JG Bachman
JH Steiger
JJ Gallo
K Fiscella
KA Bollen
L Hu
L Smith
LA Greenfield
LK Muthén
M Ramírez
MA Pentz
MG Bloche
NG Waller
PB Smith
R Reid
R Steinbrook
RC Kessler
RE Millsap
S Chatterji
S Sue
SB Green
SM Stahl
SR Cole
T Saha
W Meredith
Y Takane
Publication venue: BioMed Central
Publication date: 01/01/2009
Field of study

Background: Do estimates of alcohol abuse reflect true levels across United States Hispanics and non-Hispanic Caucasians, or does culturally-based, systematic measurement error (i.e., measurement bias) affect estimates? Likewise, given that recent estimates suggest alcohol abuse has increased among US Hispanics, the field should also ask, "Does cross-ethnic change in alcohol abuse across time reflect true change or does measurement bias influence change estimates?" Methods: To address these questions, I used confirmatory factor analyses for ordered-categorical measures to probe for measurement bias on two large, standardized, nationally representative, US surveys of alcohol abuse conducted in 1992 and 2001–2002. In 2001–2002, analyses investigated whether 10 items operationalizing DSM-IV alcohol abuse provided equivalent measurement across Hispanic (n = 4,893) and non-Hispanic Caucasians (n = 16,480). In 1992, analyses examined whether a reduced 6 item item-set provided equivalent measurement among 834 Hispanic and 14,8335 non-Hispanic Caucasians. Results: In 1992, findings demonstrated statistically significant measurement bias for two items. However, sensitivity analyses showed that item-level bias did not appreciably bias item-set based alcohol abuse estimates among this cohort. For 2001–2002, results demonstrated statistically significant bias for seven items, suggesting caution regarding the cross-ethnic equivalence of alcohol abuse estimates among the current US Hispanic population. Sensitivity analyses indicated that item-level differences did erroneously impact alcohol abuse rates in 2001–2002, underestimating rates among Hispanics relative to Caucasians. Conclusion: 1992's item-level findings suggest that estimates of drinking related social or legal problems may underestimate these specific problems among Hispanics. However, impact analyses indicated no appreciable effect on alcohol abuse estimates resulting from the item-set. Efforts to monitor change in alcohol abuse diagnoses among the Hispanic community can use 1992 estimates as a valid baseline. In 2001–2002, item-level measurement bias on seven items did affect item-set based estimates. Bias underestimated Hispanics' self-reported alcohol abuse levels relative to non-Hispanic Caucasians. Given the cross-ethnic equivalence of 1992 estimates, bias in 2001–2002 speciously minimizes current increases in drinking behavior evidenced among Hispanics. Findings call for increased public health efforts among the Hispanic community and underscore the necessity for cultural sensitivity when generalizing measures developed in the majority to minorities.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central